- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5.2k
Tar: fix handing of null terminated fields. #117410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The current implementation is trimming spaces and nulls from the end. Instead we need to find the terminating null from the start, and we shouldn't trim spaces at the end.
| Tagging subscribers to this area: @dotnet/area-system-formats-tar | 
| I have not included a test because the PR is removing some special handling and not introducing any. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR updates how null-terminated fields are handled by stopping at the first null byte rather than trimming trailing nulls and spaces, and it adjusts all consumers accordingly.
- Introduced TrimNullTerminatedto handle null-terminated spans
- Updated ParseOctalto use the new trimming logic and only ignore spaces
- Replaced all GetTrimmed*calls inTarHeader.Read.cswithParseUtf8String
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description | 
|---|---|
| TarHelpers.cs | Added TrimNullTerminated, updatedParseOctalandParseUtf8String, and removed old helpers | 
| TarHeader.Read.cs | Replaced GetTrimmedUtf8Stringcalls withParseUtf8String | 
Comments suppressed due to low confidence (3)
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHelpers.cs:274
- Add an XML doc comment for ParseUtf8Stringexplaining that it returns the UTF-8 string up to the first null byte without trimming spaces.
        internal static string ParseUtf8String(ReadOnlySpan<byte> buffer)
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHelpers.cs:280
- Add an XML doc comment for TrimNullTerminatedto clarify that it slices the span at the first null byte and leaves trailing spaces intact.
        internal static ReadOnlySpan<byte> TrimNullTerminated(ReadOnlySpan<byte> buffer)
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHelpers.cs:243
- Add unit tests for ParseOctalandParseUtf8Stringcovering scenarios with fields that include trailing spaces after the null terminator and fields with no null byte.
            buffer = TrimNullTerminated(buffer);
| 
 We should have a test that validates that we trim correctly since we did that wrong in an observable way before. So a tar with an embedded null should truncate the name at the null. You can add such a file to https://github.com/dotnet/runtime-assets/tree/main/src/System.Formats.Tar.TestData or you could produce a tar and then modify the bytes. | 
| I'll add a test case. Also, CI found some failing tests which I'll need to look into. | 
…aOffset_RegularFile_SecondEntry test to fail.
        
          
                src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarReader.cs
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | 
 @ericstj I've created an archive that looks like: Note that the filename is  The test will extract this entry and validate the name is  Is this sufficient? To use this archive in a test, I need to add it to the https://github.com/dotnet/runtime-assets repo? | 
| 
 You can either 
 We have cases of all of these. My preference would be 2 since it's most readable, but not sure if that's possible. Since it's so small I think 3 is acceptable and avoids the need to modify runtime-assets. | 
| 
 I think this can work. I'll try tomorrow. | 
| @ericstj I've added a test. Can you take another look? | 
The current implementation is trimming spaces and nulls from the end.
Instead we need to find the terminating null from the start, and we shouldn't trim spaces at the end.
Fixes #117331.
@ericstj @ViktorHofer ptal.
cc @bruce965